Abstract

We study the notion of the scaled entropy of a filtration of $\sigma$-fields (= decreasing
sequence of $\sigma$-fields) introduced in [6]. We suggest a method for computing this entropy
for the sequence of $\sigma$-fields of pasts of a Markov process determined by a random walk
over the trajectories of a Bernoulli action of a commutative or nilpotent countable group
(Theorems 5, 6). Since the scaled entropy is a metric invariant of the filtration, it follows
that the sequences of $\sigma$-fields of pasts of random walks over the trajectories of Bernoulli
actions of lattices (groups $\mathbb{Z}^d$) are metrically nonisomorphic for different dimensions
$d$, and for the same $d$ but different values of the entropy of the Bernoulli scheme.
We give a brief survey of the metric theory of filtrations, in particular, formulate the 
standardness criterion and describe its connections with the scaled entropy and the 
notion of a tower of measures. 
1 Introduction: filtrations of $\sigma$-fields; standardness;
classification

We begin by recalling some general definitions. A Lebesgue, or Lebesgue-Rokhlin, space
$(X, \mu)$ is a space with a probability measure $\mu$ that is metrically isomorphic (mod 0) to
the union of the interval $[0, \lambda)$, $\lambda \le 1$, with the Lebesgue measure and, possibly, at most
countably many atoms of positive measure whose measures sum to $1 - \lambda$. We will be interested
mainly in Lebesgue spaces with continuous measures. A measurable partition of a Lebesgue space
$(X, \mu)$ is the partition of $X$ into the preimages of points under a measurable map; without loss
of generality we may assume that this measurable map is a real-valued measurable function
$f : X \to \mathbb{R}$. A class of functions coinciding mod 0 determines a class of partitions
coinciding mod 0; in what follows, speaking about partitions, we always mean these classes rather
than individual partitions. Recall that, by Rokhlin's theorem [14], with every measurable
partition $\xi = \{C_\alpha\}$ with elements $C_\alpha$, $\alpha \in A$, we can associate a canonical system of
measures, namely, the system of conditional measures $\{\mu^C\}$ on the elements $C_\alpha = C$; the
conditional measures are well defined for almost all elements of $\xi$, so that the canonical system
of measures is well defined mod 0. The metric classification of mod 0 classes of measurable
partitions in terms of systems of conditional measures is due to V. A. Rokhlin [15].

A measurable partition $\xi$ determines, and is determined by, a $\sigma$-subfield $\mathfrak{A}_\xi$ of the $\sigma$-field
$\mathfrak{A}(X, \mu)$ of all classes of measurable sets of the space $(X, \mu)$, namely, the $\sigma$-subfield generated
by the Lebesgue sets of the corresponding measurable function. The language of $\sigma$-subfields
of $\mathfrak{A}(X, \mu)$, traditionally used in the theory of random processes, is equivalent to the more
geometric language of measurable partitions, which we will mainly use in what follows. That
the definitions are well posed for classes of objects coinciding mod 0 is
usually easy to check (see, e.g., [14, 5]).

On the set $V(X)$ of classes of measurable partitions ($\sigma$-fields) there is a natural partial
ordering. In terms of $\sigma$-fields, it is the ordering by inclusion, with respect to which $V(X)$
is a lattice.¹ We study infinite decreasing sequences of measurable partitions (or infinite
decreasing sequences of $\sigma$-fields). In this paper, the term "filtration" is a synonym of the
term "infinite decreasing sequence of measurable partitions" or "infinite decreasing sequence
of $\sigma$-fields." A filtration $\Xi = \{\xi_n, n \in \mathbb{N}\}$ is called ergodic if the intersection
$\bigcap_n \xi_n$ of its components is the trivial partition $\nu$, i.e., $\bigcap_n \xi_n = \lim_n \xi_n = \nu$.²
A general example of a filtration is the sequence of $\sigma$-fields of "pasts" of a one-sided
discrete-time random process $\{y_n, n \le 0\}$, i.e., the sequence $\{\mathfrak{A}_n\}_n$, where $\mathfrak{A}_n$ is the
$\sigma$-field generated by the random variables $\{y_k : k \le -n\}$. This filtration is ergodic if and
only if the infinite past is trivial (i.e., the process is Kolmogorov-regular). It is of special
interest to study the sequences of pasts of stationary random processes considered below; in this
case, the sequence of partitions is shift-invariant (or, in short, stationary). For more details
on this theory, see [5] and the references therein. Stationary filtrations (i.e., the sequences of
pasts of stationary discrete-time or continuous-time processes) are one of the two main objects
of filtration theory. The second class of examples, which is no less important, consists of
filtrations arising in trajectory theory and the theory of periodic approximations of dynamical
systems; here we do not consider this class. From the point of view of the theory of stationary
random processes, the filtration of pasts, its structure and its metric type, is the most
important characteristic of the process and contains deep information about it.

Filtrations $\Xi = \{\xi_n\}_{n=1}^\infty$ and $\Xi' = \{\xi'_n\}_{n=1}^\infty$ are called (metrically) isomorphic if there
exists a measure-preserving measurable transformation $T$ satisfying the condition $T\xi_n = \xi'_n$
for all $n$. The problem of metric classification of infinite ergodic filtrations was posed by
the first author (mainly in connection with trajectory theory) and has generated an extensive
literature.

The simplest example of a filtration is the Bernoulli filtration, which consists of the pasts
of a one-sided stationary Bernoulli scheme. It is ergodic, as follows from Kolmogorov's
zero-one law. A filtration metrically isomorphic to a Bernoulli filtration is called standard; it
is determined by the type of the one-dimensional distribution of the Bernoulli scheme. The
Bernoulli scheme with probabilities $1/r, \dots, 1/r$, $r \ge 2$, $r \in \mathbb{N}$, determines a standard $r$-adic
filtration; if $r = 2$, a dyadic filtration; if the one-dimensional distribution of the Bernoulli
scheme is continuous, a standard continuous filtration. More general nonstationary $\{r_n\}$-adic
standard filtrations arise from nonstationary Bernoulli schemes. Filtrations $\Xi = \{\xi_n\}$ and
$\Xi' = \{\xi'_n\}$ are finitely isomorphic if their finite fragments $\{\xi_n\}_{n=1}^m$ and $\{\xi'_n\}_{n=1}^m$ are
isomorphic for any length $m$. A filtration $\{\xi_n\}$ that is finitely isomorphic to a standard $r$-adic
(respectively, dyadic, continuous, $\{r_n\}$-adic) filtration is called homogeneous $r$-adic (respectively,
dyadic, continuous, $\{r_n\}$-adic); and a general homogeneous filtration is a filtration that is
finitely isomorphic to an arbitrary (possibly, nonstationary) standard filtration.

¹ In terms of measurable partitions, "greater" in the sense of this ordering means "finer," so that the
greatest partition is the partition (denoted by $\varepsilon$ mod 0) into separate points; and the trivial partition,
denoted by $\nu$ mod 0, whose two elements are the empty set and the whole space, is the smallest element
of the lattice of partitions. This ordering is opposite to that accepted in combinatorics, where the greatest
element of the lattice is the trivial partition.

² The intersection of $\sigma$-fields is defined literally, but in the language of measurable partitions, the
intersection is the measurable hull of the individual (set-theoretic) intersection of partitions.

The original question was whether finitely isomorphic homogeneous ergodic filtrations
can be nonisomorphic; in other words, whether they can be essentially different "at infinity"
even though all their finite fragments are isomorphic. For example, do there exist
metrically nonisomorphic ergodic dyadic filtrations? The positive answer to this question,
and thus the first example of a nonstandard ergodic dyadic sequence, was obtained in [1] (the
detailed proofs were presented in [3, 5]); this example is the sequence of pasts of a random
walk over the generators of a Bernoulli action of the free group with two generators. This
example and its further generalizations showed, in particular, that the metric type of the
filtration of pasts of a stationary process can be essentially different for different stationary
processes, and the corresponding classification problem is meaningful. The first method for
distinguishing filtrations was combinatorial, but in fact it was entropic in nature. It led to
the definition of the combinatorial (or exponential) entropy of a filtration (see [2] and below).
This made it possible to present a continuum of pairwise nonisomorphic dyadic filtrations.
In [16] it was observed that the entropy of the action of the dyadic group $\sum \mathbb{Z}_2$ is also
an invariant of the filtration generated by this action, and this also gives a continuum of
nonisomorphic filtrations. Moreover, in the dyadic case, the combinatorial entropy and the
entropy of the action coincide, though their definitions are quite different. For $\{r_n\}$-adic
sequences, these entropies coincide only for a certain growth of the numbers
$\{r_n\}$ of points in the elements of the partitions (see [5, 22]). Besides, the entropy of the action
can be defined only for homogeneous filtrations, while combinatorial entropy is defined for general
filtrations (see below).

Combinatorial entropy distinguishes only a very narrow class of filtrations, namely, filtra- 
tions with exponential asymptotics of the iterated semimetrics (see below). Later, in [6], the 
class of scaled entropies, which we deal with in this paper, was introduced as a generalization 
of the notion of combinatorial entropy. The definition of scaled entropy is based on intro- 
ducing a scaling for the growth of the entropies of appropriate partitions. In the hierarchy 
of these scalings, combinatorial entropy corresponds to exponential growth, so that it can 
be called exponential entropy. 

All currently known results illustrate the fact, not obvious a priori, that the metric
classification of general, or even stationary, filtrations is as difficult as, e.g., the metric
classification of stationary processes themselves. This is exactly why the problem of finding
constructive metric invariants of homogeneous ergodic filtrations arises. Entropy techniques,
discussed in this paper, seem most useful in this regard. Among other general theorems on
filtrations, we would like to mention the theorem on lacunary isomorphism and the ensuing notion of
the principal invariant of filtrations (see [5]); the standardness criterion suggested in [1, 3]
for distinguishing between standard and nonstandard filtrations (see § 2) is also partly
motivated by this theorem. Scaled entropy is exactly the quantitative characteristic of filtrations
that naturally arises from the analysis of this criterion. It is determined by a so-called
scaling function (see below). In this paper we formulate theorems on the scaled entropy of the
filtrations of pasts of random walks over the trajectories of Bernoulli actions of countable
commutative or nilpotent groups and outline their proofs. Possibly, this method applies
to groups for which the central limit theorem for random walks holds. For an abelian or
nilpotent group, the scaling is the power function $n^{d/2}$, where $d$ is the weighted dimension
of the group; in particular, for the lattice $\mathbb{Z}^d$ it is equal to $n^{d/2}$ (Theorems 5, 6). Moreover,
it turns out that not only the order (scaling), but also the value of the scaled entropy is an
invariant. Thus the filtrations of pasts of random walks on the lattices $\mathbb{Z}^d$ are metrically
nonisomorphic for different dimensions $d$, and even for the same $d$ but different values of the
average entropy of the transition probabilities.

The analysis of the filtrations of pasts of stationary processes provides new characteristics
of one-sided processes. As we will see, already for Kolmogorov-regular processes, i.e.,
processes with trivial infinite past, the filtrations of pasts can have quite diverse metric
properties. It is also possible that some invariants of the filtration of pasts of a random
process can be invariants of the two-sided shift in the space of trajectories of the process.
Considering the scaled entropy of filtrations arising in problems of periodic approximation
of automorphisms leads to new invariants, such as the scale of an automorphism and the
so-called principal invariant, see [4].³

2 Iterated Kantorovich metric, standardness criterion, 
and the tower of measures 

2.1 Admissible metrics and the Kantorovich distance 

In order to construct invariants of filtrations and, in particular, formulate the standardness
criterion, we need the construction of iterated metrics and the notion of a tower of measures.
But first we give the definitions of admissible metrics on a measure space (admissible triples)
and the classical Kantorovich metric on measures.

³ Note that the notion, introduced in [6], of the secondary entropy of a stationary random process is close
to the notion of scaled entropy; a similar characteristic was also studied in [30].
Definition 1. We say that a semimetric $\rho$ on a Lebesgue space $(X, \mu)$ is admissible (or the
triple $(X, \mu, \rho)$ is admissible) if the following conditions hold:

1) the semimetric $\rho(x, y)$, regarded as a function of two variables (i.e., as a function on
the space $(X \times X, \mu \times \mu)$), is measurable;

2) in the space $X$ there exists a subset $X'$ of full measure $\mu$ that is quasi-compact, in the
topological sense, with respect to $\rho$; this means that the quotient space of $X'$ with respect
to the partition into classes of points with pairwise zero distances, endowed with the quotient
metric, is a compact metric space.

As above, we consider classes of metrics (semimetrics) coinciding almost everywhere 
rather than individual metrics (semimetrics). Denote the set of all (classes of) admissible 
metrics on a given Lebesgue space $(X, \mu)$ by ^$(X, \mu)$. The metric compact triple $(X, \mu, \rho)$,
where $\rho$ is a metric that turns $X$ into a compact metric space and $\mu$ is a probability Borel
measure on this space, is an example of an admissible triple.

Now recall the definition of the Kantorovich metric on the space of measures on a compact 
metric space. 

Given a compact metric space $(X, \rho)$, one can define the Kantorovich metric $k_\rho$ on the
simplex $V(X)$ of probability Borel measures on $X$ (see [12], and also a modern exposition [7]).
The classical definition of the Kantorovich metric applies only to compact metric spaces, but
it can be extended, without essential changes, to the case of semimetrics and quasi-compact
spaces. This definition is as follows:

$$k_\rho(\mu_1, \mu_2) = \inf\Big\{ \int_{X \times X} \rho(x, y)\, dQ(x, y) \;\Big|\; P_1 Q = \mu_1,\ P_2 Q = \mu_2 \Big\};$$

here $Q$ ranges over the set of all probability measures on $X \times X$ with the given projections,
$\mu_1$ and $\mu_2$, to both coordinates, or, in the accepted terminology, with the given marginal
distributions; and $P_1$ and $P_2$ are the projections which map measures on $X \times X$ to the
simplices of measures on the corresponding coordinates.

2.2 Iterated semimetrics associated with a filtration, and the
standardness criterion

Now let us apply Kantorovich's construction to constructing metrics on a measure space
with a given measurable partition. The following procedure, suggested in [5, 3], allows one,
given an admissible semimetric $\rho$ and a measurable partition $\xi$, to construct a new admissible
semimetric on the same space $(X, \mu)$. Let us fix an admissible semimetric $\rho = \rho_0$ on the space
$(X, \mu)$ and define a new distance $\rho_1(x, y)$ on $(X, \mu)$ as the Kantorovich distance between the
conditional measures on the elements of $\xi$ that contain the given points:

$$\rho_1(x, y) = k_\rho(\mu^{C(x)}, \mu^{C(y)}),$$

where $C(x)$, $C(y)$ are the elements of $\xi$ that contain $x$ and $y$, respectively, and $\mu^C$ is the
conditional measure on an element $C \in \xi$.

Thus, given a semimetric and a measurable partition, we can define a new semimetric.
The new distance between points lying in the same element of the partition is equal to zero,
so that $\rho_1$ is a semimetric even in the case when $\rho$ is a metric. However, the quotient of this
semimetric on the quotient space $X/\xi$ is a well-defined metric.
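As an illustration only, one step of this procedure can be sketched on a finite space. The sketch assumes that every element of the partition has the same number of points and carries the uniform conditional measure (as in the homogeneous case); under that assumption the Kantorovich distance is attained on a one-to-one matching, so a brute-force search over permutations suffices for small cells. The names `kantorovich_uniform` and `iterate_semimetric` are hypothetical and do not come from the source.

```python
from itertools import permutations


def kantorovich_uniform(cell_a, cell_b, rho):
    """Kantorovich distance between the uniform measures on two finite
    cells of equal size (assumption: uniform conditional measures).

    For uniform measures on equally many points an optimal coupling can
    be chosen to be a matching, so we minimise over permutations.
    """
    if len(cell_a) != len(cell_b):
        raise ValueError("cells must have equal size")
    n = len(cell_a)
    return min(
        sum(rho(a, b) for a, b in zip(cell_a, perm)) / n
        for perm in permutations(cell_b)
    )


def iterate_semimetric(rho, partition):
    """Return the new semimetric: the Kantorovich distance between the
    conditional measures on the cells containing x and y."""
    cell_of = {x: tuple(cell) for cell in partition for x in cell}

    def rho_1(x, y):
        return kantorovich_uniform(cell_of[x], cell_of[y], rho)

    return rho_1


def rho(x, y):
    """Initial metric on the toy space X = {0, 1, 2, 3}."""
    return abs(x - y)


xi = [(0, 3), (1, 2)]          # a toy partition into two-point cells
rho_1 = iterate_semimetric(rho, xi)
print(rho_1(0, 3))             # points of the same cell
print(rho_1(0, 1))             # points of different cells
```

In this toy example the new distance vanishes inside each cell and equals 1 between the two cells, so the quotient on the two-point space of cells is a metric, in line with the remark above.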

Now assume that we have a space $(X, \mu)$ with an admissible semimetric $\rho$ and a filtration
$\Xi = \{\xi_n\}$ with $\xi_0 = \varepsilon$. Let us successively apply the above procedure: using $\rho$ and $\xi_1$,
construct a semimetric $\rho_1$; then, using $\rho_1$ and $\xi_2$, construct a semimetric $\rho_2$, etc. Since the
partitions decrease, the semimetrics are coherent, in the sense that if the distance between two
points vanishes with respect to $\rho_k$, then it vanishes with respect to all subsequent semimetrics.

Note that the average distance between pairs of points of the space $X$ does not increase
when passing to the next iteration, i.e.,

$$c_n = \int_{X \times X} \rho_n(x, y)\, d\mu(x)\, d\mu(y) \le c_{n-1} = \int_{X \times X} \rho_{n-1}(x, y)\, d\mu(x)\, d\mu(y).$$
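The monotonicity of the averages can be checked in one step. The following sketch writes $C_n(x)$ for the element of $\xi_n$ containing $x$ (a notation assumed here by analogy with $C(x)$ above) and uses the product of the two conditional measures as a particular, generally non-optimal, coupling:

```latex
\rho_n(x,y) = k_{\rho_{n-1}}\bigl(\mu^{C_n(x)}, \mu^{C_n(y)}\bigr)
  \le \int\!\!\int \rho_{n-1}(u,v)\, d\mu^{C_n(x)}(u)\, d\mu^{C_n(y)}(v),
\qquad\text{hence}\qquad
c_n \le \int\!\!\int \rho_{n-1}(u,v)\, d\mu(u)\, d\mu(v) = c_{n-1}.
```

The last step integrates the first inequality over $x$ and $y$ and uses the defining property of the canonical system of conditional measures: averaging $\mu^{C_n(x)}$ over $x$ with respect to $\mu$ gives back $\mu$.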

Now we can formulate the standardness criterion for a homogeneous filtration. 

Theorem 1 (Standardness criterion, [1, 5, 3]). A homogeneous filtration (a homogeneous
sequence of partitions) is standard if and only if for every initial semimetric $\rho$ the mean value
of the iterated distance between points tends to zero. In our notation, the latter condition
reads as

$$\lim_{n \to \infty} \int_{X \times X} \rho_n(x, y)\, d\mu(x)\, d\mu(y) = 0, \quad \text{i.e.,} \quad \lim_{n \to \infty} c_n = 0.$$

By definition, the condition of this criterion is metrically invariant. Hence the "only if"
part immediately follows from the fact that it is satisfied for a Bernoulli filtration. The "if"
part is a deep result. The standardness criterion was formulated in [1] and proved in full
detail for dyadic filtrations in [3]; the proof was reproduced for $r$-adic filtrations in [5].

For $r$-adic filtrations, the standardness criterion has a clear combinatorial meaning. In
this case, it suffices to check it for semimetrics that reduce to finite metric spaces. If we
start with such a semimetric $\rho$ (let it reduce to a $k$-point space), then the $n$th iterated
semimetric is a semimetric on the orbits of the action of the group $D_{n,r}$ of automorphisms
of the homogeneous one-root tree $T_{n,r}$ of height $n$ and valence $r$ on the space of functions
on $T_{n,r}$ with values in a $k$-point set (for instance, $\mathbf{k} = \{0, 1, \dots, k - 1\}$). In more detail, the
group $D_{n,r}$ acts by permutations of coordinates on the cube $\mathbf{k}^{r^n}$ endowed with the Hamming
metric, and the iterated metric $\rho_n$ reduces to the space of orbits of this group endowed with the
quotient metric (the distance between two orbits is the minimum distance between the points of
these orbits). It is difficult to compute this metric explicitly, but in many cases it is possible to
check whether or not it degenerates in the limit (i.e., whether or not the space reduces to
the one-point space). For example, this can be done for $r = k = 2$. It is this computation
that led to the first example of a nonstandard filtration. It was carried out in [5, 3] for the
random walk over the trajectories of a Bernoulli action of the free group and for a Bernoulli
action of the group of 2-adic integers.
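For very small trees the orbit metric can be computed directly. The sketch below treats only the case $r = k = 2$ and makes two assumptions that are not spelled out in the text: words are the leaf labels of a binary tree read left to right, and the Hamming distance is normalized by the word length (which matches averaging over uniform two-point conditional measures). It compares a recursive formula, in which at each vertex the two subtrees are either kept or swapped, with a brute-force minimum over the whole orbit. All function names are hypothetical.

```python
from itertools import product


def orbit_distance(a, b):
    """Normalized Hamming distance between the orbits of the words a, b
    (length 2**n) under the automorphisms of the binary tree of height n.

    Recursion: an automorphism either keeps or swaps the two halves and
    then acts independently inside each half.
    """
    assert len(a) == len(b)
    if len(a) == 1:
        return 0.0 if a == b else 1.0
    h = len(a) // 2
    keep = orbit_distance(a[:h], b[:h]) + orbit_distance(a[h:], b[h:])
    swap = orbit_distance(a[:h], b[h:]) + orbit_distance(a[h:], b[:h])
    return min(keep, swap) / 2


def orbit(word):
    """All images of a word under the tree automorphism group."""
    if len(word) == 1:
        return {word}
    h = len(word) // 2
    left, right = orbit(word[:h]), orbit(word[h:])
    return {l + r for l in left for r in right} | {
        r + l for l in left for r in right
    }


def hamming(a, b):
    """Normalized Hamming distance between two words of equal length."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def orbit_distance_bruteforce(a, b):
    """Quotient metric: minimum distance from a to the orbit of b
    (enough, since the group acts by isometries)."""
    return min(hamming(a, g) for g in orbit(b))


# Check that both computations agree on all words of length 4 (n = 2).
words = ["".join(w) for w in product("01", repeat=4)]
assert all(
    abs(orbit_distance(a, b) - orbit_distance_bruteforce(a, b)) < 1e-12
    for a in words
    for b in words
)
print(orbit_distance("0011", "1100"))  # 0.0: the halves are swapped
print(orbit_distance("0101", "0011"))  # 0.5
```

The brute-force version enumerates orbits whose size grows doubly exponentially in the height, which is one way to see why the text calls the explicit computation difficult; the recursion avoids the enumeration but still says nothing by itself about the limit behavior.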

To prove the criterion for filtrations with continuous conditional measures, one may use
the same scheme as in [5], making only minimal changes compared with the case of dyadic
or $r$-adic filtrations. Later, other proofs were suggested for the continuous case, which follow
essentially the same idea as in the discrete case, see [19]. A detailed survey of questions
related to standardness and other properties of filtrations considered by B. Tsirelson is given
in [20]; the latter paper also contains a rather complete list of references.

Below we refine the standardness criterion. Namely, we want not only to know whether
or not the metric degenerates, but also to obtain an estimate on the asymptotics of the
entropy of the compact metric space with the iterated metric. This is the next step in the
study of nonstandard filtrations.

2.3 Tower of measures 

Another formulation of the standardness criterion uses the concept, important in itself, of a
tower of measures [5], which we briefly reproduce here (see also [2]). First assume
that we are given a compact metric space $(Y, r)$; consider the simplex $V(Y)$ of probability
Borel measures on $(Y, r)$ endowed with the Kantorovich metric $k_r$; it is also compact in the
topology determined by this metric (i.e., in the weak topology). Moreover, there exists an
isometric embedding $i : y \mapsto \delta_y$ of the initial space $Y$ into $V(Y)$. Then we can consider the
isometric embedding of the simplex $V(Y)$ into the simplex $V(V(Y)) = V^2(Y)$ of probability
Borel measures on $V(Y)$, again with the Kantorovich metric $k_{k_r}$, and so on. Thus we
have an inductive family of compact metric spaces $V^n = V^n(Y)$ with isometric embeddings
$i_n : V^n \to V^{n+1}$, and we can consider the inductive limit of these spaces:

$$\operatorname{lim\,ind}_n (V^n(Y), i_n) = V^\infty(Y) = \mathrm{Tow}(Y, r);$$

the limit space (inductive limit) $\mathrm{Tow}(Y, r)$ is called the tower of measures; it is endowed with
an inductively defined metric, which we denote by $\bar r$. This definition is of purely topological
nature; the construction can be generalized to the case of a quasi-compact semimetric space,
but we will not need such a generalization. The space $(\mathrm{Tow}(Y, r), \bar r)$ is a (noncomplete)
metric space. Its nature is worth a detailed study; it is of special interest and importance
to study the properties of its completion with respect to the metric $\bar r$. Note that it has not
only the structure of an inductive limit, but also the structure of a projective limit. Indeed,
since $V^n(Y)$, $n > 1$, is an affine compact space, every measure from $V^n(Y)$ has a well-defined
barycenter, which is a point of $V^{n-1}(Y)$. The map that sends a measure to its barycenter is
an epimorphic affine projection $V^n(Y) \to V^{n-1}(Y)$, $n > 1$, right inverse to the embedding
$i_{n-1}$. This allows one to consider the projective limit of compact spaces $\operatorname{lim\,proj}_n V^n$; the
limit compact space is exactly the completion of the inductive limit.⁴
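That the map $y \mapsto \delta_y$ is isometric follows directly from the definition of the Kantorovich metric: the only probability measure on $Y \times Y$ whose marginals are $\delta_y$ and $\delta_z$ is $\delta_{(y,z)}$. Written out as a short check (added here for the reader, not taken from the source):

```latex
k_r(\delta_y, \delta_z)
  = \inf\Bigl\{ \int_{Y \times Y} r(u,v)\, dQ(u,v) \Bigm| P_1 Q = \delta_y,\ P_2 Q = \delta_z \Bigr\}
  = \int_{Y \times Y} r(u,v)\, d\delta_{(y,z)}(u,v)
  = r(y,z).
```

The same argument applies at every level of the tower, with $(Y, r)$ replaced by the previous simplex and its Kantorovich metric.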

2.4 The standardness criterion in terms of the tower of measures 

Let us apply the tower of measures construction to the study of filtrations. Assume that we
are given a Lebesgue space $(X, \mu)$ and an arbitrary measurable function $f : X \to [0, 1]$. Set
$f_0 = f$ and consider the tower of measures $\mathrm{Tow}([0, 1], r)$, where $r$ is the Euclidean metric
on $[0, 1]$. Assume that we are given a filtration $\{\xi_n\}_{n=1}^\infty$ on $(X, \mu)$. Let us define a sequence
of probability measures $\{\nu_f^n\}$, $n = 1, 2, \dots$, on $\mathrm{Tow}([0, 1], r)$, where $\nu_f^n = \nu^n$ is a measure
on $V^n \subset \mathrm{Tow}([0, 1], r)$, as follows. The first measure $\nu^1 \in V^1([0, 1]) \subset \mathrm{Tow}([0, 1], r)$ is the
image of $\mu$ under the map $f : X \to [0, 1]$; thus $\nu^1$ is a measure on $[0, 1]$, i.e., an element of
$V^1([0, 1])$. Then we consider the map $f_1 : x \mapsto f_0(\mu^{C_1(x)}) \in V^1$ that sends a point $x \in X$ to
the image of the conditional measure on the element $C_1(x)$ under the map $f$ restricted to
$C_1(x)$. The second measure $\nu^2$ is the image of $\mu$ under $f_1$, i.e., a measure on $V^1([0, 1])$ (a
"measure on measures" on $[0, 1]$, or an element of $V^2([0, 1])$). Note that the function $f_1$ is
well defined on the quotient space $X/\xi_1$. Now we consider the map $f_2 : X/\xi_1 \to V^2([0, 1])$
that sends a point $y \in X/\xi_1$ to the image of the conditional measure on the element of $\xi_2/\xi_1$
containing $y$ under the function $f_1$ restricted to this element. The measure $\nu^3$ is the image
of $\mu$ under $f_2$, so that it is a measure on $V^2([0, 1])$ (or an element of $V^3([0, 1])$), and so on.
In this way we inductively define a map $f_n : X \to X/\xi_{n-1} \to V^n([0, 1])$, $n = 1, 2, \dots$, and
a measure $\nu^{n+1}$, which is the image of $\mu$ under $f_n$. Thus we have constructed a sequence of
measures $\nu^n$, $n = 1, 2, \dots$, on compact spaces lying in the tower of measures $\mathrm{Tow}([0, 1], r)$.
For more details, see [5, 20].⁵

In these terms, the standardness criterion asserts that a filtration is standard if and only
if for every measurable function $f = f_0$, the sequence of measures $\nu_f^n$ collapses to a delta
measure on the completion of the tower of measures (i.e., the weak limit of $\nu_f^n$ is the delta
measure at a point belonging to the completion of the tower of measures). This can be
expressed by the following formula:

$$\lim_{n \to \infty} \iint \bar r(u, v)\, d\nu_f^n(u)\, d\nu_f^n(v) = 0,$$

where the integral is taken over the square of $\mathrm{Tow}([0, 1], r)$ and $\bar r$ is the metric on $\mathrm{Tow}([0, 1], r)$
defined by the above tower of measures construction applied to the space $([0, 1], r)$ with $r$
the Euclidean metric. Compared to the first formulation of the standardness criterion (see
above), the integration of the iterated metric over the space $(X, \mu)$ is replaced in this formula
by the integration over the tower of measures.

⁴ It is natural to say that a space with such coherent structures of an inductive limit and a projective
limit has the structure of an "indoprojective limit."

⁵ In [5], the map $f_n : X \to V^n([0, 1])$, more exactly, the map that sends the initial function $f = f_0 :
X \to [0, 1]$ to $f_n$, was called the universal projection of $f$ with respect to the finite decreasing sequence of
partitions $\Xi_n = \{\xi_k, k = 1, 2, \dots, n\}$; all joint metric invariants of $f$ and $\Xi_n$ can be expressed in terms of
the functions $f_n$.

In fact, standardness holds if the above condition is fulfilled for at least one
mod 0 one-to-one measurable function. The fact that a filtration is not standard means, on
the contrary, that there exists a function for which the sequence of measures $\nu^n$ does not
degenerate, and this function is not measurable with respect to any coherent sequence of
independent complements to the filtration (see [3]). This interpretation immediately implies
that the behavior of the sequence of measures $\nu^n$ (or, in the first interpretation, the sequence
of metrics $\rho_n$ on $(X, \mu)$) contains essential information on the asymptotics of the filtration. In
fact, the metric type of the filtration is determined by the sequence of measures $\{\nu_f^n\}_n$ on
$\mathrm{Tow}(X, \rho)$ associated with a mod 0 one-to-one function $f$. In particular, the asymptotics of
the $\varepsilon$-entropy of the measures $\nu^n$, $n = 1, 2, \dots$, on $\mathrm{Tow}(X, \rho)$ is an invariant of the filtration
and does not depend on the choice of the initial metric $\rho$.

One can study filtrations either in terms of the iterated metrics $\rho_n$ (which will be done
below) or in (equivalent) terms of the measures $\nu^n$ on the tower of measures. It is the
interrelation between the filtration and the $\varepsilon$-entropy of the metric measure spaces $(X, \mu, \rho_n)$
that will be used in the next section for introducing the notion of scaling and scaled entropy.
In brief, the difference between the two formulations of the standardness criterion can be
expressed as follows: in the first case, we fix the measure space and iterate the metric; in the
second case, we fix the compact metric space and the associated tower of measures and vary
measures on the tower of measures. Apparently, these two approaches are equivalent not
only in the formulation of the standardness criterion, but also in the analysis of numerical
characteristics constructed from the metrics $\rho_n$ in the first case and the measures $\nu^n$ in the
second case.
3 Scaled entropy of filtrations: definition, examples

3.1 Entropy of a metric measure space

The ordinary definitions of the $\varepsilon$-entropy of a compact metric space and the entropy of an
atomic measure are well known (see, e.g., [14]). We will need the following characteristic of
a metric measure space.

Definition 2. Let $\varepsilon > 0$. The $\varepsilon$-entropy of a semimetric measure space $(X, \mu, \rho)$, with $\rho$ an
admissible semimetric, is the following function of $\varepsilon$:

$$H_\varepsilon(X, \mu, \rho) = \inf\{H(\lambda) \mid k_\rho(\lambda, \mu) < \varepsilon\},$$

where $\lambda$ ranges over all discrete measures on $X$, $H(\cdot)$ is the entropy of a discrete measure,
and $k_\rho$ is the Kantorovich metric on the space of measures on $X$.

Roughly speaking, $H_\varepsilon(X, \mu, \rho)$ is the "entropy" of the continuous measure $\mu$ in the
semimetric space up to $\varepsilon$.⁶

⁶ In the literal sense, the entropy of a continuous measure is equal to infinity, and the definition of what
"up to" means depends on the semimetric $\rho$.

We will use the $\varepsilon$-entropy of a space $(X, \mu)$ endowed with an ergodic filtration and the
associated sequence of admissible iterated semimetrics $\rho_n$. The analysis of the asymptotic
behavior of this $\varepsilon$-entropy allows us to define and compute the scaled entropy of the filtration.

3.2 The definition of scaled entropy

Definition 3. We say that a positive function $c(\varepsilon, n)$ of two arguments $\varepsilon > 0$, $n \in \mathbb{N}$, is a
scaling function if it is increasing in $n$ for a fixed $\varepsilon$ and nonincreasing in $\varepsilon$ for a fixed $n$.
Two scaling functions $c(\cdot, \cdot)$ and $c'(\cdot, \cdot)$ are strictly equivalent if



If each of these limits is equal to a finite nonzero number, then the scaling functions $c(\cdot, \cdot)$
and $c'(\cdot, \cdot)$ are called equivalent.

Definition 4. The scaled entropy of a filtration $\{\xi_n\} = \Xi$ with respect to a semimetric $\rho$
with scaling function $c(\cdot, \cdot)$ is the number

$$h_c(\Xi, \rho) = \limsup_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{H_\varepsilon(X, \mu, \rho_n)}{c(\varepsilon, n)},$$

where $\rho_n$ is the iterated semimetric associated with the filtration $\{\xi_n\} = \Xi$ (see the previous
section). A proper scaling function of $\Xi$ is a scaling function for which the scaled entropy
$h_c(\Xi, \rho)$ is different from zero and infinity.
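The infimum in the definition of the $\varepsilon$-entropy is rarely computable exactly, but upper bounds are easy to obtain. The sketch below is an illustration under stated assumptions, not a method from the source: it takes a finitely supported stand-in for $\mu$, maps every point to the nearest of a few chosen centers, and returns the entropy of the resulting discrete measure together with the average displacement. Since the pair (point, center) is a coupling of the two measures, the Kantorovich distance between them is at most this displacement, so whenever the displacement is less than $\varepsilon$ the returned entropy bounds $H_\varepsilon$ from above. The function names and the use of the natural logarithm are assumptions.

```python
import math


def entropy(weights):
    """Shannon entropy (natural logarithm) of a discrete measure."""
    return -sum(w * math.log(w) for w in weights if w > 0)


def quantized_entropy(points, weights, centers, rho):
    """Push the measure forward under the nearest-center map.

    Returns (entropy of the image measure, average displacement).
    The displacement is an upper bound for the Kantorovich distance
    between the original measure and its image.
    """
    mass = {}
    cost = 0.0
    for x, w in zip(points, weights):
        # Ties are broken in favour of the first center in the list.
        c = min(centers, key=lambda center: rho(x, center))
        mass[c] = mass.get(c, 0.0) + w
        cost += w * rho(x, c)
    return entropy(mass.values()), cost


# Uniform measure on 8 equally spaced points of [0, 1), 4 centers.
points = [i / 8 for i in range(8)]
weights = [1 / 8] * 8
centers = [0.0, 0.25, 0.5, 0.75]
h, cost = quantized_entropy(points, weights, centers, lambda x, y: abs(x - y))
print(h, cost)  # entropy log(4), displacement 0.0625
```

For this toy measure the bound reads $H_\varepsilon \le \log 4$ for every $\varepsilon > 0.0625$. Tracking how such bounds grow with $n$ for the iterated semimetrics $\rho_n$ is what the choice of a scaling function is meant to capture.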
The existence of a proper scaling function is a separate problem. In the examples below
it is easy to prove.

The following proposition is obvious.

Proposition 1. 1. The values of the scaled entropy $h_c(\Xi, \rho)$ with respect to a given
semimetric $\rho$ with strictly equivalent scaling functions coincide.

2. For any filtration and a given semimetric there exists at most one, up to equivalence,
proper scaling function.

Note that we may compute the scaled entropy $h_c(\Xi, \rho)$ of a filtration with an arbitrary
scaling function, but the answer will be different from zero and infinity for at most one
class of equivalent scalings. Sometimes, for filtrations of a certain type (for example, $r$-adic)
one can choose one distinguished normalization scaling (similarly to choosing the base of
logarithms in the definition of the ordinary entropy). In this case, the common value of the
scaled entropy with respect to a given semimetric for all proper scaling functions strictly
equivalent to the normalization scaling is called the value of the scaled entropy with respect
to the given semimetric.

The definition of scaled entropy is as follows (see [6]).

Definition 5. Assume that for some class of filtrations we have chosen a normalization
scaling. The scaled entropy of a filtration $\Xi$ is the supremum of the normed scaled entropies
with respect to $\rho$ over all admissible semimetrics $\rho$, i.e., the following finite or infinite number:

$$h(\Xi) = \sup_\rho h(\Xi, \rho).$$

Recall that $\rho$ is the initial semimetric from which the iterated semimetrics $\rho_n$ were
constructed.

Theorem 2. The numerical value $h(\Xi)$ of the scaled entropy (if a normalization scaling
exists) is a metric invariant of the filtration $\Xi$.

Although, as follows from this theorem, it suffices to start the construction of iterated 
semimetrics from a metric, nevertheless we consider semimetrics, since it is more convenient 
for calculations and in this way it is easier to approximate the value of the scaled entropy. 

When calculating the scaled entropy of an arbitrary filtration, it is natural to start
searching for a correct scaling with the exponential scaling (see below) and, if the corresponding
entropy vanishes, turn to another scaling with slower growth, and so on. For a standard
filtration, the entropy vanishes for any scaling. The converse is also true, because, in view of
the standardness criterion, the standardness means that the scaled entropy vanishes for the
scaling $c(\varepsilon, n) = \mathrm{const}$. This fact is similar to Kushnirenko's theorem [13] on the action of $\mathbb{Z}$,
which states that the vanishing of all sequential entropies of an automorphism is equivalent
to the discreteness of its spectrum. The calculation of the scaled entropy of a filtration is an
interesting and not very simple problem. It is not even known which scaling functions can
actually appear in the definition of the scaled entropy of filtrations. An interesting question
concerns the connections of the scaled entropy of, e.g., dyadic sequences to properties of
group actions. Below we find the scaling and calculate the entropy for the filtrations of pasts
of several important Markov processes.

3.3 Exponential (combinatorial) entropy 

The concept of scaled entropy arises, on the one hand, from the analysis of the standardness 
criterion and, on the other hand, from the original notion of the entropy of a filtration 
suggested in [3], which, from the viewpoint of the above definition, is the scaled entropy 
with exponential scaling. We call it the exponential entropy. Let us briefly provide some 
information on this entropy; for definiteness, we restrict ourselves to the case of homogeneous 
filtrations.

We will consider $\{r_n\}$-adic homogeneous filtrations (see Sec. 1). Almost every element of
the partition $\xi_n$ of an $\{r_n\}$-adic filtration consists of $\prod_{i=1}^{n} r_i$ points; on each element the previous
partitions determine the structure of a (hierarchy) tree. Denote the group of automorphisms
of this tree by $D_{\{r_k\}}$, $k = 1, \ldots, n$. If we fix an arbitrary finite partition $\gamma$ and label its
elements, in an arbitrary way, by some symbols $0, 1, \ldots, p$, then for every $n$, for every element
of $\xi_n$ we can define a sequence of length $\prod_{i=1}^{n} r_i$ whose coordinates are the symbols $0, 1, \ldots, p$
corresponding, via $\gamma$, to the points of the given element. The definition of this sequence is
not invariant (it depends on the labelling of points in the element of the partition), but the
orbit of the action of the group $D_{\{r_k\}}$ on such sequences already depends only on the point
$x$ and the partition $\gamma$ (and, of course, on the fragment of length $n$ of the filtration). Thus
we obtain a partition $\gamma_n$ whose every element consists of all points having the same orbit of
the action of the group $D_{\{r_k\}}$ on sequences consisting of the symbols $0, 1, \ldots, p$.

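Definition 6. The entropy of an $\{r_n\}$-adic sequence of partitions $\Xi = \{\xi_n\}_{n=1}^{\infty}$ with respect to a
finite partition $\gamma$ is the number

$$h(\Xi; \gamma) = \lim_{n \to \infty} \frac{1}{\prod_{i=1}^{n} r_i} H(\gamma_n),$$

where $H(\cdot)$ is the binary entropy of a finite partition.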
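
The orbit construction behind this definition can be illustrated by a minimal sketch in a toy setting. It assumes i.i.d. uniform labels on the leaves of the hierarchy tree, which is an assumption made only for the example and not the general situation; the function names `canonical_orbit` and `normalized_orbit_entropy` are hypothetical. An orbit of the tree automorphism group is encoded by recursively sorting the subtrees, and the normalized entropy is computed by brute-force enumeration.

```python
from collections import Counter
from itertools import product
from math import log2, prod


def canonical_orbit(labels, arities):
    """Canonical representative of the orbit of a labelled leaf sequence
    under the automorphism group of the hierarchy tree.

    arities = (r_1, ..., r_n): r_1 is the branching just above the leaves,
    r_n the branching at the root (toy convention, an assumption).
    """
    if not arities:
        return labels[0]                      # a single leaf: its label
    *lower, top = arities
    chunk = len(labels) // top                # leaves under each child of the root
    children = [
        canonical_orbit(labels[j * chunk:(j + 1) * chunk], lower)
        for j in range(top)
    ]
    # Automorphisms may permute the children, so only the sorted
    # collection of canonical subtrees matters.
    return tuple(sorted(children))


def normalized_orbit_entropy(arities, symbols=(0, 1)):
    """Binary entropy of the orbit partition divided by prod(r_i),
    for i.i.d. uniform labels (toy assumption)."""
    size = prod(arities)
    counts = Counter(
        canonical_orbit(word, arities) for word in product(symbols, repeat=size)
    )
    total = len(symbols) ** size
    entropy = -sum(c / total * log2(c / total) for c in counts.values())
    return entropy / size
```

In this toy model `normalized_orbit_entropy((2,))` equals 0.75 and `normalized_orbit_entropy((2, 2))` equals 0.59375, so the normalized values do not increase with the depth of the tree.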

Note that $H(\gamma_n) \le r_n \cdot H(\gamma_{n-1})$, so that the normalized sequence is nonincreasing and
$h(\Xi; \gamma)$ is bounded.
Now we can get rid of $\gamma$ and define an invariant of the filtration.
Definition 7. The exponential entropy of an $\{r_n\}$-adic sequence of partitions is the number

$$h(\Xi) = \sup_{\gamma} h(\Xi; \gamma).$$

The constructed invariant can also be called the combinatorial entropy of a homogeneous 
filtration. The scaling depends on the sequence $\{r_n\}$. The following theorem, which is an
analog of Kolmogorov's theorem, was proved in [3]. 

Theorem 3. The entropy $h(\Xi; \gamma)$ is continuous in $\gamma$ with respect to the metric
$(\gamma_1, \gamma_2) \mapsto H(\gamma_1 \mid \gamma_2) + H(\gamma_2 \mid \gamma_1)$ on the space of finite partitions.

This allows one to approximate the exponential entropy by the values $h(\Xi; \gamma)$ for appropriate
partitions $\gamma$. An easy consequence of Theorem 3 is the following fact: the exponential
entropy of a standard filtration is equal to zero.

For filtrations generated by actions of locally finite groups, it is natural to compare the
exponential entropy with the entropy of the action (see [16]); they coincide if the growth of
$\{r_n\}$ is not too fast (see [4, 22]).

The following theorem from [9] shows that exponential entropy is a special case of scaled 
entropy. 

Theorem 4. The exponential entropy of an $\{r_n\}$-adic filtration coincides with the scaled
entropy with the scaling function $c(\varepsilon, n) = \prod_{i=1}^{n} r_i$.

Exponential entropy vanishes for a wide class of filtrations (see the next section), so that
it does not solve the problem of classification of decreasing sequences of partitions. This is
demonstrated by a number of examples of nonstandard filtrations with zero exponential
entropy. Scaled entropy allows one to further distinguish metrically nonisomorphic filtrations.

If the scaled entropy does not vanish for some nonexponential scaling, then the exponential
entropy of such a filtration vanishes. In other words, for an $\{r_n\}$-adic filtration, the
exponential scaling with the normalization chosen above is maximal possible up to
equivalence.

Exponential entropy can be defined not only for $\{r_n\}$-adic filtrations, but for arbitrary
filtrations, including those with continuous conditional measures. We do not dwell upon this
question. 

It is well known that in the ordinary entropy theory, introducing a scaling for measuring 
the rate of growth (in n) of the entropy of the product of n shifts of a partition in the case 
when the entropy of the shift vanishes, does not lead to new invariants. The reason is that 
for every ergodic automorphism with zero entropy, one can choose the initial partition so 
that the growth of the entropies of the product of n shifts of this partition as n tends to
infinity will have a given subexponential rate. In the definition of the scaled entropy of a
filtration, we avoid this difficulty by using admissible metrics. This allows us to construct 
invariants for distinguishing different asymptotics of the growth of the entropies. For a given 
metric, our definition could also be compared with the definition of the topological entropy 
of transformations, but the significant difference is that we then take the supremum over 
admissible metrics. Thus one can conjecture that the idea of scaled entropy can be used also 
for automorphisms with zero entropy. 7 

4 The scaled entropy of filtrations generated by random
walks over the trajectories of group actions

4.1 Standardness and random walks 

As mentioned above, the first example of a nonstandard filtration was the dyadic sequence 
of pasts of the random walk over the trajectories of a Bernoulli action of the free group with 
two generators [5]. The proof consisted in calculating the lengths of orbits of the group of 
automorphisms of the hierarchy on the elements of the partitions. Namely, let $\nu_n$ be the
measure on the tower of measures constructed from the semimetric corresponding to the
characteristic function of some set. It was proved that there is no orbit of such a length that
the measure $\nu_n$ is concentrated in a neighborhood of this orbit. More precisely, it was shown that
there exists a measurable set such that the behavior of its characteristic function, regarded
as a vector of length $2^n$ with coordinates 0, 1, in the hierarchy of conditional measures
corresponding to the first $n$ partitions does not stabilize even for the exponential scaling,
contradicting the standardness of the filtration. The calculations carried out in the paper not
only proved that the filtration is nonstandard, but also gave a lower bound on the exponential
entropy in this example. This was the first application of the standardness criterion and a
motivation for introducing exponential entropy. The same bound applied to the exponential
entropy of the dyadic filtrations arising from Bernoulli actions of infinite sums of the groups
$\mathbb{Z}_k$, implying the nonstandardness of these filtrations. In the latter case, the exponential
entropy numerically coincides with the entropy of the action, though their definitions are 
quite different (see [4, 22]). 

In the class of r-adic filtrations, natural examples of stationary filtrations arise from
random walks over the trajectories of automorphisms, with equiprobable transitions to one
of the r points of the trajectory; for example, the so-called $(T, T^{-1})$-endomorphism is the
random walk that moves from a point $x$ to the points $Tx$ and $T^{-1}x$ with probabilities 1/2; in
the more general case of an r-adic filtration, the random walk moves with probability 1/r to
one of the r points of the trajectory of an arbitrary group action. In general, random walks
(and, more generally, the theory of polymorphisms) provide many interesting examples of
filtrations. The first example given above also belongs to this class.

7 Usually one fixes the metric and varies the measure (cf. the notion of measure of maximal entropy); but
we, on the contrary, fix the measure and vary the metric. This idea was repeatedly used in the papers of the
first author.

Much later, another example of a random walk over trajectories was given, also using
the standardness criterion, this time in the positive direction: the filtration of pasts of the
$(T, T^{-1})$ random walk constructed from the rotation $T_\lambda$ of the circle by an irrational angle
$\lambda$ is standard; this was established first for values of $\lambda$ that can be well approximated by
rational numbers [31, 29], and then for arbitrary values of $\lambda$ [26].

The conjecture of the first author that the filtration of pasts of the random walk over the 
trajectories of a Bernoulli action of $\mathbb{Z}$ (the so-called Kalikow endomorphism) is nonstandard
had been open for a long time. S. Kalikow [12] showed that the $(T, T^{-1})$ endomorphism is not
even loosely Bernoulli. This was the first example of a natural non-Bernoulli endomorphism. 
The question naturally arose about the type of the filtration of pasts of this endomorphism. 
It is not difficult to check that its exponential entropy vanishes. Finally, in [23] it was proved, 
with the help of the standardness criterion, that the filtration of pasts of this endomorphism 
is indeed nonstandard. In fact, the proof explicitly used scaled entropy, see below. 

In [25, 21], examples are constructed showing that the Bernoulli property of an endomorphism
and the standardness of the filtration of pasts are in general position. In other
words,

1) Bernoulli automorphisms have generators leading to random processes with nonstandard
filtrations of pasts; such are, for example, the above random walk over the trajectories
of a Bernoulli action of the free group, and the examples of random walks considered below;

and

2) there exist stationary random processes with standard filtrations of pasts such that
the shifts in the spaces of trajectories of these processes are not isomorphic to a Bernoulli
shift ([21]).

4.2 Scaling for random walks over the trajectories of Bernoulli 
group actions 

The next class of examples of filtrations is generated by random walks over the trajectories
of Bernoulli actions of arbitrary groups. Assume that we are given an arbitrary countable
group $G$ with finitely many generators $g_1, \ldots, g_s$, and the Bernoulli action of this group by
left shifts $T_{g_i}$ in the space $F$ of all $\{0, 1\}$-functions on $G$ endowed with the product measure
with the factor (1/2, 1/2) (i.e., a Bernoulli measure). In what follows, the set of possible
values of functions in $F$ and its (finite) cardinality are irrelevant, so that for simplicity we
restrict ourselves to the values 0 and 1. Consider the random walk on the space $F$ over the
trajectories of the action of the group $G$ with equal transition probabilities:

$$\mathrm{Prob}\bigl(f(x) \to f(g_i^{\pm 1} x)\bigr) = \frac{1}{2s}, \qquad i = 1, \ldots, s.$$

Thus we consider a generalization of the $(T, T^{-1})$ construction in which the group $\mathbb{Z}$
is replaced with an arbitrary discrete group $G$ with finitely many generators. In the previous
notation, $X$ is the space of trajectories of the Markov process with the state space
$F(G, \{0; 1\}) = 2^G$, the Bernoulli measure on $F$ as an invariant measure, and the above transition
probabilities. This Markov process will be called the random walk over the trajectories
of the Bernoulli action of G. The sequence of pasts of this Markov process is an r-adic 
filtration with r = 2s. What can we say about this filtration, in particular, about its scaled 
entropy and scaling? In full generality this problem is far from being solved; note that the 
first example of a nonstandard filtration was exactly of this kind, with the free group as G. 
Below we give its solution for lattices and nilpotent groups. 
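
The construction can be made concrete by a minimal simulation sketch for the special case $G = \mathbb{Z}^d$ with the standard generators (an assumption made for the example). The function name `walk_over_bernoulli_scenery` is hypothetical. The sketch records only the value of the configuration at the walker's current position, whereas the state of the Markov process in the text is the whole configuration seen from that position; the Bernoulli (1/2, 1/2) field is sampled lazily at the sites actually visited.

```python
import random


def walk_over_bernoulli_scenery(d, steps, seed=0):
    """Simple random walk on Z^d over an i.i.d. (1/2, 1/2) field of 0's and 1's.

    Each of the 2d moves +-e_k is chosen with probability 1/(2d),
    matching equal transition probabilities with s = d generators.
    Returns the final position, the observed values, and the sampled field.
    """
    rng = random.Random(seed)
    scenery = {}                      # site -> 0/1, sampled on first visit
    position = (0,) * d

    def value(site):
        if site not in scenery:
            scenery[site] = rng.randint(0, 1)
        return scenery[site]

    observed = [value(position)]
    for _ in range(steps):
        axis = rng.randrange(d)       # which generator e_k
        sign = rng.choice((-1, 1))    # the generator or its inverse
        position = tuple(
            c + sign * (k == axis) for k, c in enumerate(position)
        )
        observed.append(value(position))
    return position, observed, scenery
```

For instance, `walk_over_bernoulli_scenery(1, 100)` gives a sample of the first 100 steps of a $(T, T^{-1})$-type walk over a Bernoulli action of $\mathbb{Z}$.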

The paper [24] in fact provides a bound on the scaling function (though the authors of
[24] do not use the entropy terminology) for the filtration of pasts of Kalikow's $(T, T^{-1})$
endomorphism, where $T$ is a Bernoulli automorphism. As shown in [9], the scaling function
in this case is equivalent to $c(\varepsilon, n) = (n \log\frac{1}{\varepsilon})^{1/2}$.

The paper [27] generalized the problem about the $(T, T^{-1})$ automorphism solved by
S. Kalikow. Namely, the authors of that paper considered the Markov automorphism of a
simple walk over the trajectories of an action of the lattice $\mathbb{Z}^d$. But they were interested not
in the filtration, but in the type of the Markov shift. They showed that the situation depends
crucially on the dimension $d$ of the lattice: while in [28] (for $d = 1$) it was proved that the
$(T, T^{-1})$ automorphism is not even loosely Bernoulli and, all the more so, not Bernoulli,
for $d > 1$ it is Bernoulli (though the quality of the natural generator, the type of Bernoulli
property, depends on $d$).

We study the scaled entropy and, in particular, the scaling of the corresponding filtration 
for the groups $\mathbb{Z}^d$; in this case, it turns out that the entropy properties depend on the
dimension of the lattice in a quite regular way. 

Theorem 5 (see [11]). For the group $G = \mathbb{Z}^d$ the proper scaling function of the filtration is
of polynomial growth: $c(\varepsilon, n) = (n \log\frac{1}{\varepsilon})^{d/2}$.

A further generalization of combinatorial techniques allowed the second author to find 
the proper scaling function for the case of random walks over the trajectories of a Bernoulli 
action of an arbitrary countable nilpotent group G. Recall that the weighted rank (in the 
continuous case, the weighted, or Hausdorff, dimension) of a nilpotent group is the number
$d = \sum_i i\,(n_i - n_{i-1})$, where $(n_1, \ldots, n_k)$ is the vector of ranks of the groups, with $n_i$ being
the rank of the quotient by the $i$th element of the lower central series of the group $G$.

Theorem 6. For a countable nilpotent group $G$ and the filtration $\Xi = \{\xi_n\}_{n=1}^{\infty}$ of pasts of
the Markov random walk over the trajectories of a Bernoulli action of $G$, the proper scaling
function is equivalent to

$$c(\varepsilon, n) = \Bigl(n \log\frac{1}{\varepsilon}\Bigr)^{d/2},$$

where $d = d(G)$ is the weighted rank of $G$.
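
A small numerical sketch of the weighted rank and of the scaling function may be helpful. It rests on labelled assumptions: the input is the cumulative vector $(n_1, \ldots, n_k)$ with the convention $n_0 = 0$, indexed so that $\mathbb{Z}^d$ corresponds to the one-term vector $(d)$, and the logarithm is natural (the scaling is only defined up to equivalence). The function names `weighted_rank` and `scaling` are hypothetical.

```python
import math


def weighted_rank(ranks):
    """Weighted rank d = sum_i i * (n_i - n_{i-1}) with n_0 = 0 (assumed convention).

    ranks = (n_1, ..., n_k): cumulative ranks along the lower central series.
    """
    d, previous = 0, 0
    for i, n_i in enumerate(ranks, start=1):
        d += i * (n_i - previous)
        previous = n_i
    return d


def scaling(eps, n, d):
    """Scaling function c(eps, n) = (n * log(1/eps)) ** (d / 2), natural log assumed."""
    return (n * math.log(1.0 / eps)) ** (d / 2)
```

With these conventions `weighted_rank((3,))` returns 3 for $\mathbb{Z}^3$, and `weighted_rank((2, 3))` returns 4 for the discrete Heisenberg group, whose abelianization has rank 2 and whose center contributes one generator of weight 2.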

Thus for abelian and nilpotent groups the scaling is polynomial; for a walk over the 
trajectories of a Bernoulli action of a free nonabelian group, the scaling is exponential, 
and the entropy of the corresponding filtration is the ordinary (exponential) entropy. The 
question about the scaling for random walks on solvable groups is open. 

4.3 Necessary estimates 

Here we will give only several separate statements that constitute the main part of the proof 
of the theorems on scaling, leaving the detailed exposition till another occasion. The first 
result needed for estimating the entropy of the iterated metric generalizes a result from [23] 
and can be proved in a similar way. 

Theorem 7. There exists a semimetric $\rho$ on the space $X$ and a number $\varepsilon > 0$ such that for
every polynomial $p$ there exists $n_0$ such that for all points $x$ except a set of measure $\varepsilon$,

$$\mu\{y : \rho_n(x, y) < \varepsilon\} < \frac{1}{p(n)}, \qquad n > n_0,$$

where $\rho_n$ is the iterated semimetric (see §2).
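
How a bound on the measure of balls feeds into an entropy estimate can be sketched as follows. The sketch assumes that the $\varepsilon$-entropy is the logarithm of the minimal number of balls covering a set of measure at least $1 - \varepsilon$, and that $\varepsilon < 1/2$; the symbols $E$, $A$, $N$, $c_j$ are introduced only for this illustration.

```latex
% E: exceptional set with \mu(E) \le \varepsilon outside which the ball bound holds.
% Suppose N balls B(c_j, \varepsilon/2) in \rho_n cover a set A with \mu(A) \ge 1 - \varepsilon.
% If B(c_j, \varepsilon/2) contains a point x \notin E, the triangle inequality gives
\[
  B(c_j, \varepsilon/2) \subset \{ y : \rho_n(x, y) < \varepsilon \},
  \qquad\text{hence}\qquad
  \mu\bigl(B(c_j, \varepsilon/2)\bigr) < \frac{1}{p(n)} .
\]
% All other balls lie inside E, so for n > n_0
\[
  1 - 2\varepsilon \le \mu(A \setminus E) < \frac{N}{p(n)},
  \qquad\text{i.e.}\qquad
  \log N > \log p(n) + \log(1 - 2\varepsilon).
\]
```

Since $p$ is an arbitrary polynomial, under these assumptions $\log N$ grows faster than any constant multiple of $\log n$.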

The second result is a generalization of a technical result from [24, 28]. 

Theorem 8. Let $X$ be the space of trajectories of the Markov process of a random walk over
the trajectories of a Bernoulli action (see above) of a nilpotent group $G$ with generators $g_i$,
$i = 1, \ldots, s$. For every $\delta > 0$ we can find a subset $M \subset X$ of measure $1 - \delta$ and a number
$h_0$ such that for every $h > h_0$ and every pair of trajectories $\{u_i\}$, $\{v_i\}$ from $M$, there exists
$n \in [h, h^5]$ such that

$$\frac{1}{\sqrt{n}} \Bigl\| \prod_{i=1}^{n} u_i \Bigr\| < C \quad \text{and} \quad \frac{1}{\sqrt{n}} \Bigl\| \prod_{i=1}^{n} v_i \Bigr\| < C,$$

where $\| \cdot \|$ is understood as the number of factors in the minimal representation of a group
element as a product of the generators $g_i$ and their inverses.






Note that the set M determined by the conditions of the previous theorem can be chosen 
in different ways, but in what follows only the existence of such a set is of importance. 

It will be convenient to slightly modify the construction of the space $X$ and represent
the shift in this space as a skew product. Namely, the new version of $X$ is $X = F(G) \times B^{\infty}$,
and the Markov shift $T$ in $X$ is a skew product over the one-sided Bernoulli shift.

Let us choose an initial semimetric $\rho$ on the space $X = F(G) \times B^{\infty}$ that is measurable
with respect to the partition $\eta_m$ into cylinders of order $m$ in the sense of the structures of the
spaces $F(G)$ and $B^{\infty}$. Note that with this choice of a semimetric, the corresponding iterated
semimetric $\rho_n$ will be measurable with respect to $\eta_{m+n}$; indeed, the preimage $T^{-1}(\eta_m)$ of $\eta_m$
is measurable with respect to $\eta_{m+1}$.

In order to estimate the scaling of the filtration of pasts of the Markov process, we
estimate the $\varepsilon$-entropy of the measure in the space $(X, \mu, \rho_n)$. It will be convenient to use
the combinatorial description of the iterated semimetric similar to that given in Sec. 2.2.

The iterated semimetric $\rho_n$ can be expressed in terms of the initial metric and the group
of automorphisms of the tree as follows:



This explicit expression allows us to estimate the values of the iterated metric using the
properties of the group $D_{n,r}$. For example, to estimate the entropy from above, it suffices to
estimate the iterated metric from above. To this end, simply replace the group $D_{n,r}$ by the
trivial group, immediately obtaining the bound



which, after applying the central limit theorem for the nilpotent group G (see [18]) gives the 
required upper bound. 
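
The relation between the minimum over tree automorphisms and the bound obtained from the trivial group can be illustrated by a minimal sketch. It assumes that the iterated semimetric of two points is the minimum, over automorphisms of the r-ary hierarchy tree, of the average of the initial semimetric over matched leaves; this form is an assumption suggested by the combinatorial description, and the names `iterated_distance` and `identity_bound` are hypothetical.

```python
from itertools import permutations


def iterated_distance(x, y, r, rho):
    """Minimum over automorphisms of the r-ary tree of the average of rho
    over matched leaves; x and y are leaf sequences of length r**n."""
    if len(x) == 1:
        return rho(x[0], y[0])
    chunk = len(x) // r
    xs = [x[j * chunk:(j + 1) * chunk] for j in range(r)]
    ys = [y[j * chunk:(j + 1) * chunk] for j in range(r)]
    # cost[j][k]: best matching of subtree j of x with subtree k of y
    cost = [[iterated_distance(a, b, r, rho) for b in ys] for a in xs]
    # an automorphism permutes the r subtrees and then acts inside each of them
    best = min(
        sum(cost[j][sigma[j]] for j in range(r))
        for sigma in permutations(range(r))
    )
    return best / r


def identity_bound(x, y, rho):
    """Upper bound obtained by replacing the automorphism group with the
    trivial group: leaves are matched in their given order."""
    return sum(rho(a, b) for a, b in zip(x, y)) / len(x)
```

For the leaf sequences `(0, 1, 1, 0)` and `(1, 0, 0, 1)` with `r = 2` and the 0/1 discrepancy as `rho`, the trivial-group bound is 1.0 while the minimum over automorphisms is 0.0, which shows how crude the upper bound can be for individual pairs.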

The lower bound on the entropy (and, correspondingly, on the iterated semimetric) is
slightly more difficult to obtain. The iterated semimetric does not allow for a simple uniform
bound, but we can estimate the measure of a typical $\varepsilon$-ball in the iterated metric $\rho_n$. Such
a bound also allows us to estimate the $\varepsilon$-entropy.

The fact that the semimetric $\rho_n$ is measurable with respect to the cylinder partition
allows us to translate the problem into purely combinatorial terms: we need to estimate the
number of cylinders of order $n + m$ lying at a distance less than $\varepsilon$ from a given cylinder on the
quotient space $X/\eta_{n+m}$ with the reduced semimetric (see Sec. 2.2). Recall that $D_{n,r} \subset S_{r^n}$,
where $S_{r^n}$ is the symmetric group acting by substitutions on the space $K_n = \{0, 1\}^{r^n}$ of
sequences of 0's and 1's of length $r^n$. Let us fix two typical points (i.e., points from the set
$M$) $x$ and $y$ lying at a distance at least $\varepsilon$. The proof reduces to studying the properties
of the automorphism $a \in D_{n,r}$ that realizes the distance between these points. Consider
the projection $x \mapsto \bar{x}$ of the space $X$ to the quotient space $X/\eta_{n+m}$ and the embedding
$* : x \mapsto x^* = \{x_i,\ i = 1, \ldots, r^n\}$ that sends a point to the collection of its preimages.

The main property of the automorphism $a$ is that it superposes two vectors, $x^* \in K_n$ and
$y^* \in K_n$, of exponential (in $n$) length $r^n$, and these vectors are in turn determined by vectors,
$\bar{x}, \bar{y} \in X/\eta_{n+m}$, of polynomial length $(n + m)^r$.

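It turns out that the action of the minimizing automorphism $a$ "almost" factorizes. This
means that the diagram

$$\begin{array}{ccc}
K_n & \xrightarrow{\;a\;} & K_n \\
\uparrow {\scriptstyle *} & & \uparrow {\scriptstyle *} \\
X/\eta_{n+m} & \xrightarrow{\;A\;} & X/\eta_{n+m}
\end{array}$$

is almost commutative, in the sense that the (Hamming) distance between the results of
following two paths in this diagram does not exceed some $\delta$ which tends to zero as $n$ tends
to infinity:

$$d(a(x^*), A(\bar{x})^*) < \delta.$$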

In addition to the fact that the automorphism admits the above "$\delta$-factorization," it
turns out that the constructed quotient map is almost the identity. Further calculations
show that the number of classes $\bar{y} \in X/\eta_{n+m}$ close to $\bar{x}$ in the iterated semimetric $\rho_n$ is
subexponential, i.e., inessential from the point of view of the $\varepsilon$-entropy.

Translated by N. V. Tsilevich.